AITopics | regular splitting


Optimal control of robust team stochastic games

Huang, Feng, Cao, Ming, Wang, Long

arXiv.org Artificial Intelligence, May 16, 2021

In stochastic dynamic environments, team stochastic games have emerged as a versatile paradigm for studying sequential decision-making problems of fully cooperative multi-agent systems. However, the optimality of the derived policies is usually sensitive to the model parameters, which are typically unknown and required to be estimated from noisy data in practice. To mitigate the sensitivity of the optimal policy to these uncertain parameters, in this paper, we propose a model of "robust" team stochastic games, where players utilize a robust optimization approach to make decisions. This model extends team stochastic games to the scenario of incomplete information and meanwhile provides an alternative solution concept of robust team optimality. To seek such a solution, we develop a learning algorithm in the form of a Gauss-Seidel modified policy iteration and prove its convergence. This algorithm, compared with robust dynamic programming, not only possesses a faster convergence rate, but also allows for using approximation calculations to alleviate the curse of dimensionality. Moreover, some numerical simulations are presented to demonstrate the effectiveness of the algorithm by generalizing the game model of social dilemmas to sequential robust scenarios.
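As a rough illustration of the "Gauss-Seidel modified policy iteration" named in the abstract, the sketch below applies the idea to an ordinary single-agent, non-robust MDP. It is not the paper's robust team-game algorithm: the random MDP, the function name `gs_modified_policy_iteration`, and the parameters `m`, `tol`, and `max_iters` are all assumptions made for illustration. Each outer step picks a greedy policy, then runs `m` in-place (Gauss-Seidel) evaluation sweeps, so that states updated later in a sweep already use the values updated earlier in the same sweep.

```python
import numpy as np


def gs_modified_policy_iteration(P, R, gamma, m=5, tol=1e-10, max_iters=10_000):
    """Hypothetical sketch for a plain MDP.

    P has shape (actions, states, states); R has shape (states, actions).
    Rewards are assumed non-negative so that starting from v = 0 gives a
    monotonically improving sequence of value estimates.
    """
    n_actions, n_states, _ = P.shape
    v = np.zeros(n_states)
    policy = np.zeros(n_states, dtype=int)
    for _ in range(max_iters):
        # Policy improvement: act greedily with respect to the current values.
        q = R + gamma * np.einsum("ast,t->sa", P, v)
        policy = q.argmax(axis=1)
        v_old = v.copy()
        # Partial policy evaluation: m in-place sweeps over the states.
        for _ in range(m):
            for s in range(n_states):
                a = policy[s]
                v[s] = R[s, a] + gamma * P[a, s] @ v
        if np.max(np.abs(v - v_old)) < tol:
            break
    return v, policy


# A small random MDP, purely illustrative.
rng = np.random.default_rng(0)
n_states, n_actions, gamma = 6, 3, 0.9
P = rng.random((n_actions, n_states, n_states))
P /= P.sum(axis=2, keepdims=True)  # each row becomes a probability distribution
R = rng.random((n_states, n_actions))

v, policy = gs_modified_policy_iteration(P, R, gamma)
print("values:", np.round(v, 4))
print("greedy policy:", policy)
```

Setting `m=1` reduces the evaluation phase to a single in-place sweep per improvement step, while larger `m` moves the scheme closer to full policy iteration.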

algorithm, regular splitting, stochastic game, (14 more...)

arXiv:2105.07405

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Netherlands (0.04)
Asia > China > Beijing > Beijing (0.04)
(2 more...)

Genre: Research Report (0.81)

Industry: Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

TDprop: Does Jacobi Preconditioning Help Temporal Difference Learning?

Romoff, Joshua, Henderson, Peter, Kanaa, David, Bengio, Emmanuel, Touati, Ahmed, Bacon, Pierre-Luc, Pineau, Joelle

arXiv.org Machine Learning, Jul 6, 2020

We investigate whether Jacobi preconditioning, accounting for the bootstrap term in temporal difference (TD) learning, can help boost performance of adaptive optimizers. Our method, TDprop, computes a per parameter learning rate based on the diagonal preconditioning of the TD update rule. We show how this can be used in both $n$-step returns and TD($\lambda$). Our theoretical findings demonstrate that including this additional preconditioning information is, surprisingly, comparable to normal semi-gradient TD if the optimal learning rate is found for both via a hyperparameter search. In Deep RL experiments using Expected SARSA, TDprop meets or exceeds the performance of Adam in all tested games under near-optimal learning rates, but a well-tuned SGD can yield similar improvements -- matching our theory. Our findings suggest that Jacobi preconditioning may improve upon typical adaptive optimization methods in Deep RL, but despite incorporating additional information from the TD bootstrap term, may not always be better than SGD.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv:2007.02786

Country:

North America > Canada > Quebec > Montreal (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Russia (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Education (0.68)
Leisure & Entertainment > Games (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

A Matrix Splitting Perspective on Planning with Options

Bacon, Pierre-Luc, Precup, Doina

arXiv.org Artificial Intelligence, Jul 10, 2017

We show that the Bellman operator underlying the options framework leads to a matrix splitting, an approach traditionally used to speed up convergence of iterative solvers for large linear systems of equations. Based on standard comparison theorems for matrix splittings, we then show how the asymptotic rate of convergence varies as a function of the inherent timescales of the options. This new perspective highlights a trade-off between asymptotic performance and the cost of computation associated with building a good set of options.
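To make the matrix-splitting idea concrete, here is a minimal sketch for plain policy evaluation, where the value function solves $(I - \gamma P)v = r$. It does not reproduce the options-specific splitting from the paper. It only compares three textbook splittings $A = M - N$ of $A = I - \gamma P$: Richardson ($M = I$, which corresponds to the usual Bellman backup), Jacobi, and Gauss-Seidel. The random transition matrix, the reward vector, and the helper names are assumptions chosen for illustration. The asymptotic convergence rate of the iteration $v_{k+1} = M^{-1}(N v_k + r)$ is governed by the spectral radius of $M^{-1}N$.

```python
import numpy as np


def splitting_iteration(M, N, r, iters=200):
    """Iterate v <- M^{-1} (N v + r) for a splitting A = M - N."""
    v = np.zeros_like(r)
    for _ in range(iters):
        v = np.linalg.solve(M, N @ v + r)
    return v


def spectral_radius(M, N):
    """Largest eigenvalue modulus of the iteration matrix M^{-1} N."""
    return float(np.max(np.abs(np.linalg.eigvals(np.linalg.solve(M, N)))))


# Illustrative Markov chain and rewards (assumed, not taken from the paper).
rng = np.random.default_rng(0)
n, gamma = 5, 0.9
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
r = rng.random(n)

A = np.eye(n) - gamma * P
v_exact = np.linalg.solve(A, r)

# Three choices of M; in each case N = M - A.
splittings = {
    "richardson": np.eye(n),          # standard Bellman backup
    "jacobi": np.diag(np.diag(A)),    # diagonal of A
    "gauss_seidel": np.tril(A),       # lower triangle of A, diagonal included
}

rho, values = {}, {}
for name, M in splittings.items():
    N = M - A
    rho[name] = spectral_radius(M, N)
    values[name] = splitting_iteration(M, N, r)
    err = np.max(np.abs(values[name] - v_exact))
    print(f"{name:13s} spectral radius = {rho[name]:.4f}, max error = {err:.2e}")
```

All three are regular splittings of the same matrix, so the standard comparison theorems order their spectral radii: moving more of $A$ into $M$ lowers the radius but makes each solve with $M$ more expensive, which mirrors the trade-off the abstract describes.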

artificial intelligence, machine learning, splitting, (17 more...)

arXiv:1612.00916

Country: North America > United States > California > San Francisco County > San Francisco (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)
